Quantifying the effects of practicing a semantic task according to subclinical schizotypy

The learning ability of individuals within the schizophrenia spectrum is crucial for their psychosocial rehabilitation. When selecting a treatment, it is thus essential to consider the impact of medications on practice effects, an important type of learning ability. To achieve this end goal, a pre-treatment test has to be developed and tested in healthy participants first. This is the aim of the current work, which takes advantage of the schizotypal traits present in these participants to preliminarily assess the test’s validity for use among patients. In this study, 47 healthy participants completed the Schizotypal Personality Questionnaire (SPQ) and performed a semantic categorization task twice, with a 1.5-hour gap between sessions. Practice was found to reduce reaction times (RTs) in both low- and high-SPQ scorers. Additionally, practice decreased the amplitudes of the N400 event-related brain potentials elicited by semantically matching words in low-SPQ scorers only, which shows the sensitivity of the task to schizotypy. Across the two sessions, both RTs and N400 amplitudes had good test–retest reliability. This task could thus be a valuable tool. Ongoing studies are evaluating the impact of fully deceptive placebos and of real antipsychotic medications on these practice effects. This round of research should subsequently assist psychiatrists in making informed decisions about selecting the most suitable medication for the psychosocial rehabilitation of a patient.

psychometric self-questionnaires 27 and the absence of confounding effects of disease chronicity and previous exposure to antipsychotic medications 28 .
Investigating the effects of practice in a laboratory setting using experimental psychology methods may be valuable for better specifying and understanding the cognitive changes and learning deficits observed within the spectrum of schizophrenia. Effects of practice, which refer to the spontaneous learning that naturally occurs between two testing sessions, reflect various cognitive abilities, such as procedural memory and the development of test-taking strategies. Such abilities appear to be essential for optimizing performance in many daily activities [29][30][31]. In healthy participants, practice can effectively remedy poor cognitive performance. Such natural practice effects differ from what is often called learning potential (LP) 5, which involves improvements induced by training interventions that occur between the test and retest sessions (i.e., test-train-test approaches) 32,33. In a meta-analysis 34, scores in executive function and attention tasks were found to improve over time in healthy subjects, while no significant improvement was found in individuals within the schizophrenia spectrum. Another meta-analysis 35 also reported that during the course of the disorder, such patients did not improve in performance across repetitions of cognitive tests compared to the control group. Similar abnormalities in practice effects have been found in subclinical people with schizotypy, such as poorer performance at the retest session than at the test session in tasks such as continuous performance 36, Wisconsin card sorting, and verbal fluency tests 37. This lack of practice effects reflects an abnormal learning of new skills within the schizophrenia spectrum. As mentioned by the American Academy of Clinical Neuropsychology (AACN), "There is an obvious need for more data on normal change trajectories for all types of measures with all types of demographic variables and patient groups" 38.
On the other hand, it remains unclear whether antipsychotic medications have an acute impact on practice effects. The effects of practice mentioned above were observed between testing sessions that were often separated by several days, weeks, or even months. However, we now know that antipsychotic medications can improve certain clinical symptoms of patients much faster. For instance, Agid et al. 39 found a significant dose-related effect (20 mg vs. 2 mg) on the early psychosis factor and scores of a positive symptom subscale 4 hours after intramuscular (IM) injection of ziprasidone. Patients with schizophrenia taking 10 mg of olanzapine IM showed overall relief on the psychosis factor of the Brief Psychiatric Rating Scale (BPRS) 2 hours after injection 40. Furthermore, as early as one hour after the first injection, a reduction in the Excited Component of the Positive and Negative Syndrome Scale (PANSS-EC) scores was evident in the group receiving 10 mg olanzapine IM [41][42][43][44] and in the group receiving the oral disintegrating tablet 45. In fact, such early clinical effects are not surprising. Previous studies have reported that a single dose of 400-450 mg quetiapine gives rise to transiently high (58-64%) striatal dopamine D2 occupancy 2 to 3 hours after its intake 46. Nevertheless, the speed at which an intake of antipsychotics acts on cognitive deficits has not been measured yet. Given the pivotal role that learning ability plays in the social rehabilitation of patients, it could be useful to investigate practice effects within a short time frame first, such as 90 minutes. This duration is close to the time at which the maximum plasma concentration is reached after the intake of one pill for most antipsychotics 47 and is short enough to be usable in clinical practice. Together with a rapid decrease in clinical symptoms, acute cognitive improvements might predict the efficacy of the medication for the psychosocial rehabilitation of a patient in the long run.
Bizarre and aberrant semantic processing is one of the central features of schizophrenia. It can be assessed by semantic priming paradigms, such as lexical decision tasks (LDTs), which are used in schizophrenia spectrum research. In these tasks, participants are presented with strings of letters (e.g., toble) and are required to decide whether or not each string is a real English word. Semantics are manipulated by presenting a related prime word (e.g., chair) or an unrelated one (e.g., car) before each target word (e.g., table) [48][49][50][51]. A variety of studies have confirmed a semantic priming effect. The time taken to make the lexical decision, or reaction time (RT), is shorter for target words preceded by such related words 52. In people with schizophrenia attributes (SzAs, i.e., schizophrenia patients and subclinical people with schizotypal traits), this RT priming is smaller than in healthy controls with low schizotypal traits when the time between the onset of the priming word and that of the string of letters is longer than 500 ms 53. On the other hand, the N400 event-related brain potential (ERP) has also been found to depend on semantic priming. This ERP has a negative-going electrical polarity and peaks around 400 ms after the onset of the stimulus [54][55][56]. Like RTs, N400 amplitudes are smaller for target words preceded by a related word than for target words preceded by an unrelated word, which is usually interpreted as indexing easier processing of semantic information 54. Researchers have investigated N400 priming impairments in people with SzAs and showed that their N400 amplitudes in response to unprimed targets were generally a bit smaller, while in response to primed targets, they were somewhat larger than those of healthy controls, resulting in reduced N400 effects [57][58][59][60][61]. N400 semantic priming deficits have been shown to predict worse symptomatic and functional outcomes after one 62 and two years 63. While abnormalities in other electrophysiological indexes, such as the mismatch negativity, P3a, and auditory steady-state response, have been observed in schizophrenia patients 64, their applicability to study practice effects is limited because they only reflect automatic pre-attentive functions.
Both RTs and N400 amplitudes have already been used to measure practice effects. In healthy participants, significant effects of practice and priming on both measures were found over a 3-month test-to-retest delay 51. Interestingly, at least two other studies also report different effects of practice on the effects of priming. According to the results of Besche-Richard et al. 48, practice does not change priming effects in healthy participants. In schizophrenia patients, the behavioral semantic priming remained impaired, whereas their smaller N400 priming effect and their clinical symptoms were found to be significantly improved at their one-year retest session. However, Kiang et al. 50 reported that in healthy participants, the amplitude of the N400 semantic priming effect decreased by about 1.22 µV from the test to the retest session spaced one week apart. A similar, albeit non-significant (likely due to a small sample size), effect of practice on priming effects was also found on RTs.
To address some of the discrepancies across the studies mentioned above, a particular semantic categorization task 65 was chosen to measure practice effects on RTs and N400 amplitudes for the present study. In this task, the question-word "ANIMAL?" is systematically presented at the beginning of each trial as a reminder of the task instruction. It is followed by an exemplar (e.g., dog) or a non-exemplar word (e.g., table) of the animal semantic category. Participants have to decide whether or not the target word belongs to this category. This task was first chosen because it uses language stimuli, which are the types of stimuli patients frequently encounter when interacting with others or when receiving instructions at the workplace. It was also chosen because it focuses directly on the meaning of the stimuli rather than on judging whether or not a string of letters is a real word. Indeed, it is the understanding of meanings that is of critical importance for rehabilitation. Focusing on such semantics also yields more robust N400 effects 66, which enhances the reproducibility of results. In addition, like LDTs, this semantic task enables the recording of both the behavioral responses and brain activity, which allows for the identification of the stages of processing that undergo changes and those that do not change with practice in a patient. This task was also chosen because its difficulty is moderate, with error rates smaller than 10% in healthy participants. Using such low-difficulty tasks prevents uncertainty about response accuracy and reduces the potential for disengagement and/or shifts in the cognitive strategies used during the task. Finally, presenting an instruction word with each trial refreshes participants' working memory (WM) and effectively circumvents the WM deficiencies observed in people with SzAs 67,68.
In summary, our end goal is to explore the potential utility of practice effect measures, an important type of learning ability, within a short timeframe by using a particular semantic categorization task to detect and quantify the rapid effects of medication on these measures. Indeed, these rapid effects could potentially predict the therapeutic efficacy of an antipsychotic on the psychosocial rehabilitation of a patient with schizophrenia in the long term. To achieve this end goal, it is necessary to first evaluate these practice effects and the reliability of these measures in subclinical individuals according to their schizotypal traits. This is the aim of the present work. To ensure that the test remains short enough for easy use with patients in clinical settings, a short time interval (i.e., 90 minutes) was used between the first session (referred to as the study session) and the second session (referred to as the test session).

Questionnaires
The SPQ scores of our participants covered a relatively wide range of the continuum between low and high schizotypy, namely, from 0 to 38 (out of 74, the maximal score). The mean of the total SPQ score of all participants was 17.4 (SD = 11.0). High- and low-schizotypy subgroups did not significantly differ in terms of sex, age, and level of education (see Table 1). There was a SPQ x session interaction on the level of anxiety (F(1, 45) = 10.4, p = 0.002, ηp² = 0.19). It was further explored by pairwise comparisons, which revealed that the mean anxiety level of the high SPQ subgroup (mean = 28.4, SD = 17.3) was significantly higher than that of the low SPQ subgroup (mean = 15.0, SD = 13.9) before the start of the experiment. STAI scores of the two groups significantly increased over the course of the experiment. This increase was larger for the low- than for the high-SPQ subgroup, so there was no significant difference between the mean anxiety level of the low SPQ subgroup (mean = 59.4, SD = 4.5) and that of the high SPQ subgroup (mean = 56.6, SD = 5.7) after the experiment. In contrast, fatigue did not significantly increase during the experiment. Fatigue of the high SPQ subgroup (mean = 38.2, SD = 17.0) was significantly higher than that of the low SPQ subgroup (mean = 21.9, SD = 17.6), both before and after the experiment (F(1, 45) = 11.2, p = 0.002, ηp² = 0.20).
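As a quick consistency check on the effect sizes reported above, partial eta squared can be recovered from an F statistic and its degrees of freedom with the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The minimal Python sketch below applies it to the two F tests of this paragraph. The function name is illustrative only and is not part of the authors' analysis, which was run in SPSS.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    # F values and degrees of freedom copied from the paragraph above.
    reported_tests = [
        ("SPQ x session interaction on anxiety", 10.4, 1, 45),
        ("SPQ subgroup difference in fatigue", 11.2, 1, 45),
    ]
    for label, f_value, df_effect, df_error in reported_tests:
        eta = partial_eta_squared(f_value, df_effect, df_error)
        print(f"{label}: partial eta squared = {eta:.3f}")
    # prints 0.188 and 0.199
```

Both values round to the reported 0.19 and 0.20, so the F statistics and effect sizes given above are mutually consistent.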
Table 1. Demographic characteristics of participants.

We then focused on each category condition to identify the source of the session x SPQ x electrode interaction. There was a SPQ x session interaction in the exemplar condition (F(1, 45) = 6.4, p = 0.015, ηp² = 0.12). Post-hoc ANOVAs revealed that N400 amplitudes in session 2 (−0.3 µV) were significantly smaller than in session 1 (−1.5 µV) only in the exemplar condition and only for participants with low SPQ scores (Fig. 2). This was not the case for participants in the high SPQ subgroup. There was a marginally significant effect of session on the N400 effect (F(1, 45) = 4.1, p = 0.05, ηp² = 0.08). The N400 effect of session 2 (−1.6 µV) was a bit larger than that of session 1 (−1.0 µV) (see Fig. 3). There was neither an effect of SPQ on the N400 effect nor any interaction including this factor. ERP figures for raw N400 amplitudes for the exemplar and non-exemplar conditions are provided in the Supplementary Materials.

Behavioral data
To test whether behavioral data obtained in this particular semantic task were reliable, we examined whether participants who had, for instance, faster mean reaction times at session 1 relative to other participants also had faster mean reaction times at session 2 relative to other participants. As illustrated in Fig. 4 [...]

The reliability of N400 amplitudes was assessed at Cz and for a central cluster of 11 electrodes (Fc3/4, C3/4, Cp3/4, P3/4, Pz, Cz, and Fcz), where N400 effects obtained with written word stimuli are usually found to be maximal. We also examined whether participants had similar scores from different subsets of trials (i.e., internal consistency reliability). Table 2 displays the internal consistency reliability (ICR) and the test-retest reliability (TRR) found. In general, the mean N400 amplitudes of both non-exemplar and exemplar conditions had good reliability. The mean reliability of N400 amplitudes of the exemplar condition was better than that of the non-exemplar condition at Cz and the central cluster. Figure 5 shows scatterplots of correlations between session 1 and session 2 for N400 amplitudes at Cz.
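To make the test-retest indices concrete, the following Python sketch computes Pearson's r and an intraclass correlation coefficient (ICC) between two sessions. It is an illustration under stated assumptions, not the authors' SPSS procedure: it takes one score per participant and per session (for instance a mean RT), uses the single-measure, absolute-agreement form often written ICC(A,1) (the Methods specify a two-way mixed model with absolute agreement, but whether single or average measures were used is assumed here), and the example scores are made up.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def icc_absolute_agreement(session1, session2):
    """Single-measure, absolute-agreement ICC for two sessions (two-way model)."""
    n, k = len(session1), 2
    rows = list(zip(session1, session2))
    grand = sum(session1 + session2) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(session1) / n, sum(session2) / n]
    # Mean squares for participants (rows), sessions (columns), and residual error.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (rows[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err + k / n * (ms_cols - ms_err))


if __name__ == "__main__":
    # Made-up scores: every participant shifts by the same amount at session 2.
    s1 = [1.0, 2.0, 3.0, 4.0]
    s2 = [3.0, 4.0, 5.0, 6.0]
    print(f"Pearson r = {pearson_r(s1, s2):.3f}")
    print(f"ICC(A,1)  = {icc_absolute_agreement(s1, s2):.3f}")
```

In this toy example, participants keep exactly the same rank order, so Pearson's r is 1, whereas the absolute-agreement ICC is lower because it also penalizes the uniform shift between sessions. This is one reason to report both indices when a systematic practice effect is expected.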

Correlations between N400 amplitudes and SPQ
We did not find a strong correlation between SPQ scores and N400 amplitudes (most Pearson's r values were smaller than 0.3). Only one significant correlation was found between SPQ Interpersonal scores and N400 effects of session 2 at Pz (Pearson's r = 0.31, p = 0.016) (Fig. 6), possibly because the highest SPQ score in our sample was not very high compared to those seen in schizophrenia patients 24.

Discussion
The present study is an initial step towards evaluating a task that could enable a rapid selection of the most effective antipsychotic medication to improve the practice effect of a patient. Indeed, these improvements have the potential to facilitate his/her rehabilitation and subsequently ameliorate his/her psychosocial outcomes. We conducted a study involving 47 healthy individuals who were tested twice in a particular semantic categorization task, taking into account their schizotypal traits. First, practice effects were observed over the course of the 90 minutes separating the two sessions of this task, as indicated by a significant reduction in RTs from session 1 to session 2. Second, good test-retest reliability was observed, and the test was found to be sensitive to the mild-to-moderate schizotypal traits of our subclinical participants. A high response accuracy was observed, indicating that participants were attending to the stimuli and performing the task. In addition, as mentioned, practice effects were observed. Participants responded 50 ms faster in session 2 than in session 1, consistent with the results found in three previous studies: in Kiang et al. 50, who reported a 30 ms decrease over a one-week interval; in Besche-Richard et al. 48, who found a 48 ms decrease over a one-year interval; and in Yu et al. 51, who found a 45 ms RT reduction in a lexical decision task at the retest session 3 months later.
RTs and N400 measures had good test-retest reliability, which is consistent with previous studies [49][50][51]. The test-retest reliability for the N400 at Cz was 0.69 for the non-exemplar and 0.80 for the exemplar conditions, and the internal consistency reliability was 0.88 for the two conditions. In other studies, the reliability of N400 amplitudes at Cz has been reported as 0.69 for unrelated and 0.74 for related targets within a one-week interval 50, and as 0.75 for unrelated and 0.55 for related targets within a three-month interval 51. The high test-retest reliability of N400 amplitudes found in healthy controls in the present study is comparable to, or better than, that of other putative ERP biomarkers, like the amplitudes of the P300 and of the mismatch negativity 70,71. The present results also further support the N400 as a potential bioindicator in longitudinal treatment studies of the schizophrenia spectrum. However, we did not find test-retest reliability for the N400 priming effect. One possible reason is the relatively small proportion of related prime-target pairs (RP) employed in this study, at approximately 33.3%. Notably, Stolz et al. 72 reported that the reliability of priming effects depends on RP, being lower when RP is small.
The N400 semantic priming effect of session 2 was 0.5 µV larger than that of session 1. This effect of practice on the N400 semantic priming effect did not interact with the SPQ group. This finding is inconsistent with previous studies. For example, Besche-Richard et al. 48 reported that, in healthy participants, practice does not change semantic priming effects at their one-year retest session. Kiang et al. 50 found that the N400 semantic priming effect decreased by about 1.22 µV from the test to the retest session spaced one week apart. One possible explanation for these discrepancies is that in the present study, the interval between the two sessions was 90 minutes, which is much shorter than in these other studies, that is, one year and one week, respectively. Barnett et al. 73 proposed that the size of the practice effect is inversely proportional to the interval between the test and retest sessions.
Secondly, we found that N400 amplitudes for the exemplar condition decreased significantly with practice in participants with low SPQ scores, but not in participants with high SPQ scores. Interestingly, this difference between SPQ subgroups was not reflected in the RTs, suggesting that N400s may index processes other than those indexed by RTs. As a matter of fact, in tasks like the present one, where participants had to provide a response as fast as possible (in addition to being as accurate as they could), RTs mainly depend on activations, as repeatedly found when testing the so-called race models of RTs 74. These models account for RTs obtained, for instance, in divided attention tasks, when two target stimuli are presented simultaneously. There, participants respond faster than when only one stimulus is presented. According to race models, the two target stimuli are processed independently. RTs are then determined by the stimulus that triggers or activates the response first: the winner of the race. This could also be the case here, as our word stimuli (e.g., dog) activate more than one meaning corresponding to a response (e.g., the meaning of pet and the meaning of mammal). The meaning that is processed the fastest will be the one activating the response first. In contrast to RTs, N400s might index inhibitory mechanisms [75][76][77]. The decrease of N400 amplitudes observed with practice in participants with low SPQ could be a consequence of a decrease in the activation of inappropriate representations by the question-word "ANIMAL?". Less inhibition would then be necessary during the processing of exemplar target words; hence, reduced N400 amplitudes would be found in these conditions. If this were the case, it would mean that one of the effects of practice is the development of more focused processing and that such improvements are compromised by a higher degree of schizotypal traits. It would also mean that a notable inhibition remains to be performed at the occurrence of the target word in participants with higher SPQ scores.
Besche-Richard et al.'s results 48 can be used to further support a difference between RTs and N400s. At a retest session one year later, they found that symptom-relieved schizophrenia patients still had deficits in priming effects on RTs, whereas their priming effects on N400 significantly improved. These findings also suggest that the effect of practice on N400s might be a better predictor of clinical improvement than the effects of practice on RTs. This claim can be reinforced by the overall slower RTs usually exhibited by patients with schizophrenia, responsible for an inflation of the relative differences in RTs between non-exemplar and exemplar conditions, which can sometimes lead to spurious behavioral results 78.
One potential limitation of this study could be the repetition effect. Indeed, the task included identical stimuli across the two consecutive sessions. The observed practice effects could thus be partly attributed to repetition effects. Such repetition is well known to cause a robust decrease in the amplitudes of the raw N400 and of the N400 effects that are obtained in lexical decisions [79][80][81]. Surprisingly, such a robust N400 decrease was observed neither for non-exemplar words nor for either type of word in the high SPQ group. What was observed here was a moderate N400 decrease only for exemplar words and only in the low SPQ subgroup. This absence of a robust and systematic effect of repetition on the N400 could be due to the other task that participants had to perform between the semantic task of session 1 and the semantic task of session 2. This task used words other than those used in the semantic task. It also pertained to the meaning of these words, but in a very different way. This intervening task could have prevented the classical robust and systematic effects of repetition on the N400s.
Classically, the repetition of stimuli also induces a decrease in RTs 56,82,83. The faster RTs observed in session 2 than in session 1 could thus be due to such effects. Nevertheless, this possibility is unlikely because the three aforementioned studies reported RT reductions of a similar magnitude to the one reported here, despite employing much longer intervals between the two sessions. Repetition effects on RTs tend to be more pronounced with shorter delays [84][85][86]. Consequently, if the observed RT decrease were solely attributable to stimulus repetition, it would have been substantially larger than the 50.4 ms that we reported. Instead, it is more plausible that this RT decrease is primarily a result of the pure practice effects associated with the task itself. This claim is further supported by the enduring nature of procedural memory, which remains almost unchanged over the years 87,88. This could account for the fact that the sizes of the RT reductions observed in our study are similar in magnitude to those in the three previous studies.

The level of participants' anxiety should be considered when interpreting the results. The high SPQ subgroup exhibited significantly higher anxiety levels compared to the low SPQ subgroup before the experiment. This is consistent with previous research showing that the SPQ score is positively correlated with trait anxiety 89. Participants' anxiety could have thus contributed to the lack of practice effect in the N400 amplitudes among high SPQ scorers in the exemplar condition. Indeed, this high level of pre-experiment anxiety may partially impede flexibility and the ability to maintain and develop focus during practice 90. A meta-analysis of 177 studies has shown that self-reported measures of anxiety are reliably associated with poorer performance on measures of working memory capacity 91. Anxiety-related behavioral phenotypes seem to be related to disrupted prefrontal cortex neural activity in healthy individuals 92,93. Anxiety levels of both groups increased during the experiment. Some other electroencephalography (EEG) studies also reported increased levels of anxiety at the end of the experiment 94,95. This may be due to the novelty of the experience, concerns about the expected outcome, the discomfort of wearing electrode caps, the need to control blinks and eye movements, etc.
Next, a normative sample of schizophrenia patients should be tested longitudinally to know whether the initial change of practice effects induced by the chosen medication predicts the long-term (e.g., a year) functional outcome of a patient. In addition, the study of such effects may be critical not only in clinical trials but also in all studies using placebo controls. In fact, improvements in patients receiving placebos are likely to be at least partly due to practice effects and not only to the expectation bias inherent to placebo-controlled designs 31. Future studies should thus compare practice effects in patients receiving a placebo to those of individuals who are not receiving anything. Other studies investigating the impact of antipsychotic medications (e.g., olanzapine and risperidone) on practice effects could then control for the placebo effect of the studied medication. This would be a more rigorous way to see whether the impact of medication on practice effects predicts patients' psychosocial rehabilitation, and whether such effects can thus be used to select what would be considered the best medication for a particular patient.

Methodology
Participants
Participants were selected among candidates who answered our English or French online advertisements in a variety of social media (e.g., Kijiji, Facebook, and the McGill Classified Ads website). Participants had to be native English or French speakers with at least ten years of education in either language. The sample size was calculated by power analysis using G*Power 3.1. We referred to previous studies where the effect size of the practice effect on RTs was around 0.33 51. Calculations showed that a minimum of 22 participants per group was necessary to achieve 80% power. To ensure replicability, we used 47 participants (45 Anglophones and 2 Francophones). All participants were right-handed and had no previous history of a neurological condition, no medical condition that compromises brain function, and no head injury with a loss of consciousness longer than 5 minutes. We also excluded those with a personal history of DSM-IV Axis I psychiatric disorder, a family history of schizophrenia or bipolar disorder, alcohol or substance abuse disorder, or current use of a psychotropic medication. Forty-seven healthy participants between the ages of 18 and 30 (mean = 23.1, SD = 3.1, 21 females) were retained. All participants provided written informed consent prior to participation. This study was approved by the Douglas Ethics Review Board (project number: IUSMD-06-42). All research was performed in accordance with relevant guidelines/regulations and the Declaration of Helsinki.

Psychometric scales
Before the EEG recording session, participants also completed a set of questionnaires in their preferred language (English or French). This set included the State-Trait Anxiety Inventory Form Y (STAI-Y State) 96 and a self-created questionnaire evaluating fatigue (see Supplementary Materials). STAI-Y has high reliability coefficients of 0.92 for the State and 0.90 for the Trait scales (Form Y), respectively 97. The questionnaire set also included the Schizotypal Personality Questionnaire (SPQ), which has high internal reliability (coefficient alpha > 0.90) and test-retest reliability (r = 0.82) 98. This questionnaire was initially designed to measure the severity of schizotypal personality traits in the general population. It has been widely used as a psychometric tool in research 99. Nevertheless, it is based on the DSM-III-R criteria used to diagnose schizophrenia and is also used with schizophrenia patients [24][25][26]. The STAI-Y and fatigue questionnaires had to be completed twice, that is, once before and once after the experiment, in order to evaluate the effect of sessions on fatigue and anxiety as potential confounding factors. In contrast, the SPQ was administered only once before the EEG session, as SPQ total scores remain relatively stable over time 100.

Procedure
Upon arrival, participants completed a demographics questionnaire, where they provided information regarding their sex, age, and level of education. Immediately after completing the set of questionnaires, the EEG recording session began, during which participants provided responses in the semantic categorization task (session 1) for about 15 minutes. Right after, for approximately 15 minutes, participants had to perform another task using word stimuli and focusing on their meaning. However, that task was peripheral to the objectives of the present study and is therefore not discussed. A one-hour lunch break was then given to all participants. It was followed by session 2, where the target words of the semantic task remained the same as those used in session 1.

Stimuli
The semantic task was identical to the one used in Debruille et al. 69, which has both an English and a French version. Participants completed tasks in their preferred language, English or French. It included 180 trials, each made of two serially presented words. In two-thirds of the trials, the first word was the question-word "ANIMAL?", followed by an exemplar (e.g., dog) or a non-exemplar (e.g., table) of the animal category. Participants had to decide if the target word belonged to the animal category as accurately and rapidly as possible by pressing a "Yes" key for matching targets or a "No" key for mismatching targets with their right index finger. Exemplar and non-exemplar words were matched for the number of letters and frequency of usage using the Content et al. 101 database for the French words and the Kucera et al. 102 counts for the English words. In one-third of the trials, the first word was "INACTION", which meant that participants should not respond to the second stimulus of the trial. The priming word "ANIMAL?" (or "INACTION") appeared in the center of the screen for 500 ms and was replaced by a fixation cross for 500 ms. This cross was followed by the target word for 1000 ms or until a valid keypress occurred (Fig. 7). Such a long stimulus onset asynchrony (SOA), that is, such a long time between the onset of the prime and the onset of the target word, was chosen because it allows observing robust N400 priming deficits in schizophrenia patients [103][104][105]. Moreover, long SOAs have been shown to boost test-retest reliability 72,106. Each target word was then replaced, 1.5-2 s later, by the word "Blink", which lasted for 500 ms.
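The timing described above can be summarized in a few lines of Python. This is only a sketch of the trial structure as stated in the text, not the authors' presentation script: the class and function names are hypothetical, and the split between exemplar and non-exemplar trials is deliberately left out because it is not specified in this paragraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrialTiming:
    """Durations of the events of one trial, in milliseconds (from the text)."""

    prime_ms: int = 500        # "ANIMAL?" or "INACTION" shown at screen center
    fixation_ms: int = 500     # fixation cross replacing the first word
    target_max_ms: int = 1000  # target shown until a valid keypress or this limit
    blink_ms: int = 500        # duration of the "Blink" prompt

    @property
    def soa_ms(self) -> int:
        """Stimulus onset asynchrony: prime onset to target onset."""
        return self.prime_ms + self.fixation_ms


def trial_counts(n_trials: int = 180) -> dict:
    """Split trials into respond ("ANIMAL?") and no-response ("INACTION") trials."""
    respond = n_trials * 2 // 3
    return {"ANIMAL?": respond, "INACTION": n_trials - respond}


if __name__ == "__main__":
    timing = TrialTiming()
    print(f"SOA = {timing.soa_ms} ms")
    print(trial_counts())
```

With the stated durations, the SOA is 1000 ms, and the 180 trials divide into 120 "ANIMAL?" trials and 60 "INACTION" trials.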

Data acquisition
The time taken to make the exemplar/non-exemplar decision (i.e., the RT) was recorded at each trial. The EEG was recorded from 28 tin electrodes mounted on an Electro-Cap International (ECI) cap. The following sites of the international 10-20 system were used: Fp1/2, F3/4, Fc3/4, C3/4, Cp3/4, P3/4, O1/2, Fz, Fcz, Cz, Pz, F7/8, Ft7/8, T3/4, Tp7/8, and T5/6. The right earlobe was used as the reference, and the ground was placed 2 cm anterior to Fz. The impedance was measured before the experiment using a 30 Hz current and was kept below 5 kΩ. An electronic notch filter was used to reduce the 50 Hz electromagnetic noise coming from power lines. The high- and low-pass filters had their half-amplitude cut-offs set at 0.01 Hz and 100 Hz, respectively. EEG signals were digitized at a 248 Hz sampling rate.

Data processing and measures
The data were processed in MATLAB using the EEGLAB toolbox with the ERPLAB extension. An independent component analysis (ICA) was first used to identify and remove artifactual components, such as blinks, eye movements, and myograms. The infomax ICA algorithm was run on a copy of the continuous EEG that was high-pass filtered at 1 Hz and low-pass filtered at 30 Hz. The resulting ICA weight matrix and sphering matrix were then applied to the continuous 0.1-30 Hz filtered EEG, and the ICLabel EEGLAB extension was used to flag artifactual independent components (ICs), that is, those with a > 20% probability of being muscle activity or a > 8% probability of being eye activity 107. These ICs were then systematically subtracted from this continuous 0.1-30 Hz filtered EEG, as in Finke et al. 108, Goregliad et al. 109, and Markey et al. 110.
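The flagging rule above can be sketched outside MATLAB as follows. This is a minimal NumPy illustration, not the authors' pipeline: the function names are hypothetical, the input is assumed to be a components-by-classes probability matrix such as the one ICLabel produces, and the column order in `CLASSES` is an assumption that should be checked against the ICLabel output actually used.

```python
import numpy as np

# Assumed column order of the class-probability matrix (verify against ICLabel).
CLASSES = ["Brain", "Muscle", "Eye", "Heart", "Line Noise", "Channel Noise", "Other"]
MUSCLE_THRESHOLD = 0.20  # > 20% probability of muscle activity
EYE_THRESHOLD = 0.08     # > 8% probability of eye activity


def flag_artifactual_ics(class_probs):
    """Return indices of ICs whose muscle or eye probability exceeds its threshold."""
    probs = np.asarray(class_probs, dtype=float)
    if probs.ndim != 2 or probs.shape[1] != len(CLASSES):
        raise ValueError("expected an (n_components, 7) probability matrix")
    muscle = probs[:, CLASSES.index("Muscle")]
    eye = probs[:, CLASSES.index("Eye")]
    return np.flatnonzero((muscle > MUSCLE_THRESHOLD) | (eye > EYE_THRESHOLD))


def remove_components(data, unmixing, mixing, bad):
    """Subtract flagged ICs from channels-by-samples data.

    unmixing is assumed to be weights @ sphere, and mixing its (pseudo)inverse.
    """
    activations = unmixing @ np.asarray(data, dtype=float)
    activations[bad, :] = 0.0  # zero the flagged components
    return mixing @ activations  # back-project the remaining components
```

Zeroing the flagged activations before back-projection is equivalent to subtracting those components' contribution from the filtered EEG, provided the decomposition is full rank.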
Only trials including behavioral responses performed between 300 and 2500 ms post-onset were retained. This was done to eliminate trials where participants did not pay enough attention, were too hesitant, or, on the contrary, provided responses that were too prompt, revealing an absence of real stimulus evaluation. The EEG epochs of those trials were taken from 200 ms pre-stimulus to 1000 ms post-stimulus. Their baselines were set by computing the mean voltages in the -200 to 0 ms time window for each electrode and by subtracting this mean value from each point of the -200 to 1000 ms epoch. Trials with voltages exceeding the amplitude range of ±100 μV for the 4 frontal electrodes (Fp1/2, F7/8) and of ±75 μV for the remaining 24 electrodes were rejected, as well as epochs containing flat lines persisting for more than 100 ms. Participants were included in [...] intraclass correlation coefficients (ICCs) and Pearson's r correlation coefficients between session 1 and session 2 were calculated. A "Two-Way mixed effects model" and "absolute agreement" were used in the ICC calculation 51,112. All statistical analyses were performed with IBM SPSS Statistics (version 27). The Greenhouse-Geisser adjustment of the degrees of freedom was used to compensate for the heterogeneity of variances across electrodes. In those cases, the original F-values and degrees of freedom are provided together with the corrected p-values. The Benjamini-Hochberg false discovery rate (B-H FDR) procedure was then used to evaluate the significance of each p-value of each series of tests. P-values were thus first ranked from the most to the least significant. One B-H FDR threshold for each of these p-values was then computed by dividing its rank by the total number of tests and by multiplying the result by the false discovery rate chosen (i.e., 10%). The p-value was declared significant if it was smaller than that threshold.
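The threshold computation described above can be written as a short sketch. It follows the per-p-value comparison as worded in the text; the function name and the example p-values are hypothetical. Note that the standard B-H step-up procedure additionally declares significant every p-value ranked below the largest one that meets its threshold, a refinement this sketch does not implement.

```python
def bh_thresholds(p_values, fdr=0.10):
    """For each p-value, return (rank, threshold, significant).

    rank 1 is the smallest p-value; threshold = rank / n_tests * fdr.
    Results are returned in the same order as the input.
    """
    n_tests = len(p_values)
    order = sorted(range(n_tests), key=lambda i: p_values[i])
    results = [None] * n_tests
    for rank, idx in enumerate(order, start=1):
        threshold = rank / n_tests * fdr
        results[idx] = (rank, threshold, p_values[idx] < threshold)
    return results


if __name__ == "__main__":
    example = [0.001, 0.04, 0.03, 0.20]  # made-up p-values for illustration
    for p, (rank, thr, sig) in zip(example, bh_thresholds(example)):
        print(f"p={p:.3f} rank={rank} threshold={thr:.3f} significant={sig}")
```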

Figure 1. Mean reaction times on non-exemplar and exemplar conditions in session 1 and session 2 (N = 47). Error bars display the standard error. *** are for p < 0.001.

Figure 2. Spline interpolated isovoltage scalp maps illustrating (1) the effect of practice on the N400 amplitudes of the exemplar condition in participants with low SPQ scores (N = 23) and (2) the absence of such an effect in the high SPQ (N = 24) subgroup. The values coded by the map colors correspond to the results of the subtraction of the mean ERP voltages of session 1 from those of session 2 in the N400 time window (300-500 ms). * are for 0.05 > p > 0.01. ** are for 0.01 > p > 0.001.

Figure 3. Illustrating the larger N400 effects in session 2 than in session 1. N400 effects were obtained by subtracting the ERPs of the exemplar condition from the ERPs of the non-exemplar condition. Those of session 1 are the dark red lines for the low SPQ subgroup (N = 23) and the light red lines for the high SPQ subgroup (N = 24). Those of session 2 are the dark blue lines for the low SPQ subgroup and the light blue lines for the high SPQ subgroup.

Figure 4. Illustrating the reliability of behavioral data across the two sessions by plotting the mean RTs of session 1 as the x coordinate of each disk/square-participant and the mean RTs of session 2 as its y coordinate (N = 47).

Figure 5. Illustrating the reliability of N400 amplitudes at Cz across the two sessions by plotting the N400 mean voltages of session 1 as the x coordinate of each circle-participant and the N400 mean voltages of session 2 as its y coordinate (N = 47).

Figure 7. The procedure of the semantic categorization task.